A system is disclosed that services a plurality of queues
associated with respective data connections in a packet
communication network such that the system guarantees
data transfer delays between the data source and the
destination of each data connection. This is achieved in two
stages. The first stage shapes the traffic of each connection
such that it conforms to a specified envelope. The second
stage associates timestamps with the packets released by the
first stage and chooses for transmission from among them
the one with the smallest timestamp. Both stages are
associated with a discrete set of delay classes. The first stage
employs one shaping structure per delay class. Each shaping
structure in turn supports a discrete set of rates and employs
a FIFO of connections per supported rate. A connection may
move between FIFOs corresponding to different rates as its
rate requirement changes. The second stage associates with
each packet exiting the first stage a timestamp given by the
exit time from the first stage and the delay class to which the
connection belongs. A queue of packets is maintained per
delay class, and the scheduler selects for transmission from
among the packets at the head of the queues the one with the
smallest timestamp.

GUARANTEEING DATA TRANSFER DELAYS
IN DATA PACKET NETWORKS USING
EARLIEST DEADLINE FIRST PACKET
SCHEDULERS

CROSS-REFERENCE TO RELATED
APPLICATIONS

This application claims the benefit of the filing date of
U.S. provisional application No. 60/085,547, filed on May
15, 1998 as attorney docket no. Chiussi 12-1.

FIELD OF THE INVENTION

The present invention relates to a system for scheduling
packets in packet networks and, more particularly, to
guaranteeing data transfer delays from data sources to
destinations.

BACKGROUND

FIG. 1 shows a packet network in which a plurality of
switches 2 are connected to each other by communication
links 8. A number of data sources 4 and destinations 6 are
connected to the communication switches 2. From time to
time, a network connection is established or torn down from
each of these data sources 4 to a corresponding destination.
The connection establishment process involves one of the
data sources 4 sending a packet including control information
that indicates one of the destinations 6 to which it
desires the connection and the desired envelope of the data
traffic it agrees to send on the connection, along with the
desired delay bounds at each of the communication switches
2 on the path to the destination. The above desired connection
and envelope are specified in terms of leaky-bucket
parameters as disclosed in R. Cruz, "A Calculus for Network
Delay, Part II: Network Analysis," IEEE Transactions on
Information Theory, pp. 121-141, January 1991. For the
tear-down of a connection, the data source sends a packet
including control information indicating that the connection
needs to be torn down.

When one of the switches 2 in the network receives a data
packet indicating that a connection needs to be established,
the switch executes a call admission control (CAC) procedure
to determine whether or not the delay required by the
connection can be guaranteed by the network. If the result of
such a procedure in every switch on the path of the connection
in the network indicates that the delay can be
guaranteed by the network, then the connection is accepted
in the network. On the other hand, if the result of such a
procedure in at least one of the switches 2 on the path of the
connection in the network indicates that the delay cannot be
guaranteed by the network, then the connection is not
accepted in the network.

The provision of quality-of-service (QoS) guarantees,
such as bandwidth, delay, jitter, and cell loss, to applications
of widely different characteristics is a primary objective in
emerging broadband packet-switched networks. In such
networks, packet-scheduling disciplines are necessary to
satisfy the QoS requirements of delay-sensitive applications,
and they ensure that real-time traffic and best-effort traffic
can coexist on the same network infrastructure. Among the
scheduling algorithms that have been proposed in the literature,
two classes of schemes have become popular: those based
on generalized processor sharing (GPS) and those based on
earliest deadline first (EDF). For a survey of these
algorithms, see S. Keshav, An Engineering Approach to
Computer Networking, ATM Networks, the Internet, and the
Telephone Network, Addison-Wesley, Ithaca, N.Y., 1996; H.
Zhang, "Service Disciplines for Guaranteed Performance
Service in Packet-Switching Networks," Proceedings of the
IEEE, pp. 1374-1396, October 1995.

EDF scheduling has been known for many years in the
context of processor scheduling as disclosed in C. L. Liu and
J. W. Layland, "Scheduling algorithms for multiprogramming
in a hard real time environment," Journal of ACM, pp.
46-61, January 1973. Furthermore, it has been more
recently proposed as a possible packet-scheduling discipline
for broadband networks as disclosed in D. Ferrari and D.
Verma, "A Scheme for Real-Time Channel Establishment in
Wide-Area Networks," IEEE Jour. Sel. Areas Commun., pp.
368-379, April 1990; D. Verma, H. Zhang, D. Ferrari,
"Guaranteeing Delay Jitter Bounds in Packet Switching
Networks," Proc. TRICOMM, pp. 35-46, Chapel Hill, N.C.,
October 1991. The EDF scheduling discipline generally
works as follows: each connection i at a switch k is
associated with a local delay deadline $d_i^k$; then an incoming
packet of connection i arriving to the scheduler at time t is
stamped with a deadline $t + d_i^k$, and packets in the scheduler
are served by increasing order of their deadline.

For a single switch, EDF is known to be the optimal
scheduling policy as disclosed in L. Georgiadis, R. Guerin,
and A. Parekh, "Optimal Multiplexing on a Single Link:
Delay and Buffer Requirements," RC 19711 (97393), IBM T.
J. Watson Research Center, August 1994; J. Liebeherr, D.
Wrege, and D. Ferrari, "Exact Admission Control for Networks
with a Bounded Delay Service," IEEE/ACM Trans.
Networking, pp. 885-901, December 1996. Optimality is
defined in terms of the schedulable region associated with
the scheduling policy. Given N connections with traffic
envelopes $A_i(t)$ (i=1, 2, . . . , N) sharing an output link, and
given a vector of delay bounds $\vec{D} = (d_1, d_2, \ldots, d_N)$, where
$d_i$ is an upper bound on the scheduling delay that packets of
connection i can tolerate, the schedulable region of a
scheduling discipline $\pi$ is defined as the set of all vectors
$\vec{D}$ that are schedulable under $\pi$. EDF has the largest
schedulable region of all scheduling disciplines, and its
non-preemptive version (NPEDF) has the largest schedulable
region of all non-preemptive policies. The schedulable region
of the NPEDF policy consists of those vectors that satisfy the
following constraints:

$$d_1 \ge \frac{L}{r} \tag{1}$$

$$L + \sum_{i=1}^{N} A_i(t - d_i) \le rt, \quad \frac{L}{r} \le t \le d_N \tag{2}$$

$$\sum_{i=1}^{N} A_i(t - d_i) \le rt, \quad t \ge d_N \tag{3}$$

where $d_1 \le d_2 \le \ldots \le d_N$, L is the packet size (if the packet
size is variable, then L is the maximum packet size), r is the
link rate, and $A_i(t) = 0$ for $t < 0$. Within a single node, once the
traffic envelopes are known, a 100% link utilization can be
achieved (at least in principle) with this characterization.

The difficulties arise in a multi-switch or multi-node
network where the traffic envelopes are no longer determined
at the inputs of the nodes inside the network, and the
interactions that distort the traffic are not easily
characterizable. This problem is not peculiar to EDF, but is
common to any scheduling discipline. As a general framework to
handle the multi-node problem, H. Zhang and D. Ferrari,
"Rate-Controlled Service Disciplines," Jour. High Speed
Networks, pp. 389-412, 1994, propose a class of schemes
called rate-controlled service (RCS) disciplines, which
reshape the traffic at each hop within the network. As
schematically shown in FIG. 2, an RCS server 10 has two
components: a shaper 12, which reshapes the traffic of each
connection, and a scheduler 14, which receives packets
released by the shaper and schedules them according to a
specific scheduling discipline, such as EDF as disclosed in
L. Georgiadis, R. Guerin, V. Peris, and K. Sivarajan,
"Efficient Network QoS Provisioning Based on per Node Traffic
Shaping," IEEE/ACM Trans. Networking, pp. 482-501,
August 1996 ("Georgiadis et al."), who build upon this
model and derive expressions for the end-to-end delay
bounds in terms of the shaper envelope and scheduling delay
at each node. They also show the following useful properties
of RCS.

Identical shapers at each switch along the path of a
connection i (i.e., shapers having identical shaper envelopes
for connection i) produce end-to-end delays that are no
worse than those produced by different shapers at each
switch. Therefore, for any given connection, identical
shapers can be used at each node. This shaper envelope
common to all shapers for connection i is denoted as $E_i(t)$.

The end-to-end delay bound for connection i is given by:

$$\bar{D}_i = D(A_i \| E_i) + \sum_{k} d_i^k \tag{4}$$

where $D(A_i \| E_i)$ denotes the maximum shaper delay, and
$d_i^k$ is the bound on the scheduler delay for packets of
connection i at the k-th switch on its path, the sum being
taken over the switches on that path. The maximum
shaper delay is incurred only once and is independent of the
number of nodes on the path. The total scheduler delay is the
sum of the individual scheduling delays $d_i^k$ at each node.

When EDF scheduling is used together with per-node
reshaping (this combination is referred to as RC-EDF), and
the delay components in Equation (4) are properly chosen,
the same delay bounds as GPS can be achieved.

The above properties combined with Equations (1-3)
enable a call admission control (CAC) framework that
decides if connections may or may not enter the network
while ensuring that end-to-end performance constraints are
met. How CAC works is first analyzed for an isolated EDF
scheduler, and then the analysis proceeds to the multi-node
case.

In a single switch, Equations (2) and (3) immediately lead
to a CAC scheme. With RC-EDF, given the traffic envelope
$E_i(t)$ (enforced by the shaper) and local delay bound $d_i$ for
each of the connections being multiplexed on a link of the
switch, the equations are combined into:

$$L + \sum_{i=1}^{N} E_i(t - d_i) \le rt, \quad t \ge \frac{L}{r} \tag{5}$$

where $d_i \ge L/r$, i=1, 2, . . . , N.

Equation (5) can be graphically interpreted to yield a
simple single-switch CAC scheme. $E_i(t - d_i)$ is the curve
obtained by shifting the connection i arrival envelope curve
$E_i(t)$ to the right by its delay bound $d_i$, and denotes the
minimal service $S_i(t)$ required by connection i in order to
meet its local delay bound. The aggregate service demand of
all connections at the scheduler is thus given by

$$S(t) = L + \sum_{i=1}^{N} E_i(t - d_i)$$

(where the term L accounts for the non-preemptive nature of
the scheduler). If the aggregate service demand S(t)
never exceeds the server capacity given by R(t)=rt for $t \ge L/r$,
the packets can be scheduled such that none misses its
deadline. FIG. 3 illustrates the service capacity and service
demand curves for a simple example with two leaky-bucket
constrained connections. Since the relationship
$S(t) \le R(t)$ holds in this example, both connections
can be admitted with guaranteed delay bounds $d_1$ and
$d_2$.

The CAC procedure for the multi-node case is as
follows. For an incoming connection i with traffic arrival
envelope at the edge of the network $A_i(t)$ and end-to-end
delay requirement $d_i$, the end-to-end CAC algorithm performs
the following steps to determine if the connection can
be accommodated in the network:

1. It chooses an appropriate shaper with envelope $E_i(t)$ for
the connection, and computes the corresponding delay
$D(A_i \| E_i)$. The delay computation is described in Georgiadis
et al.

2. The k-th switch on the path is assigned a delay bound
$d_i^k$ such that

$$D(A_i \| E_i) + \sum_{k} d_i^k \le d_i,$$

and a single-node schedulability check according to the
schedulability criterion of Equation (5) is performed (using
envelope $E_i(t)$ and delay bound $d_i^k$) at each switch on the path.

3. The connection is admitted only if every switch on the
path can accommodate the connection.

The results in Georgiadis et al. can be used for choosing
the shaper envelope and for splitting the total scheduling
delay among the schedulers on the path. For
leaky-bucket-constrained sources with traffic arrival envelope
$A_i(t) = \sigma_i + \rho_i t$, generalized processor sharing performance
can be matched by choosing shaper envelope
$E_i(t) = \min(L + g_i t,\ \sigma_i + \rho_i t)$, and assigning local delay
bound $d_i^k = L/g_i + L/r^k$ to the k-th switch on the path, where
$r^k$ is the link rate and $g_i$ is the rate allocated to the
connection at each switch.

A potentially serious problem is that the implementation
of an RC-EDF server, which consists of a traffic shaper and
an EDF scheduler, can be quite complex. Without techniques
to reduce this complexity, the scheme would be unaffordable
in practice for application to current packet switches.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a method
and an apparatus to implement an EDF-related packet server
of minimum complexity, comprising a shaper and a
scheduler, which guarantees a small value of the maximum
data transfer delay to each connection.

The EDF-related packet server (alternately called the
RC-EDF server) provides a shaper and a scheduler. In
accordance with a first aspect of the invention, the shaper
holds and releases packets such that the traffic belonging to
each connection exiting the shaper conforms to a
pre-specified envelope, while the scheduler at each scheduling
instant selects from among the packets released by the
shaper the one to be transmitted next on the outgoing link.
To reduce the implementation complexity of both the shaper
and the scheduler, the RC-EDF server supports a (relatively 
small) discrete set of delay classes. Each delay class is 
associated with a delay value, and this delay value corre- 
sponds to the delay guarantee provided by the scheduler to 
the packets of each connection contained in the delay class. 

The shaper provides one shaping structure per delay class 
(each of such structures is referred to as delay class shaper), 
each of which shapes all connections corresponding to a 
specific delay class. Each delay class shaper supports a 
discrete number of shaping rates, each of which is associated 
with a first-in-first-out (FIFO) queue. Each shaping rate
corresponds to a possible value of the slope corresponding 
to a piece of a piecewise-linear envelope. Each connection 
which has at least one packet in the shaper is associated with 
exactly one of the above-mentioned FIFO queues. The delay 
class shaper has a two-level hierarchical structure. The lower 
level of the hierarchy uses FIFO queues, one per rate. At the 
higher level of the hierarchy, the timestamps associated with 
each FIFO queue are used to select from among the different 
FIFO queues for that delay class. Based on the timestamps, 
the delay-class shaper selects one FIFO queue and sends the 
first packet of the connection at the head of the selected 
FIFO to the scheduler. 

The scheduler maintains a queue of packets for each delay 
class. Each packet is associated with a timestamp derived 
from its release time by the shaper and the delay bounds 
guaranteed to the connection. A sorter selects from among 
the packets at the head of the FIFOs the one with the 
minimum timestamp for transmission on the outgoing link. 
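
The scheduler just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the claimed implementation: the class name `ClassQueueScheduler` and its methods are hypothetical, and it assumes that a packet's timestamp is its release time from the shaper plus the delay value of its delay class. Because packets of one class enter their queue in release order, the head of each queue already carries that class's smallest timestamp, so the sorter only has to compare one entry per delay class.

```python
from collections import deque


class ClassQueueScheduler:
    """Illustrative EDF-style scheduler with one FIFO per delay class."""

    def __init__(self, class_delays):
        # class_delays[c] is the delay value associated with delay class c.
        self.class_delays = list(class_delays)
        self.queues = [deque() for _ in self.class_delays]

    def enqueue(self, packet, delay_class, release_time):
        # Assumed timestamp rule: shaper release time + class delay.
        timestamp = release_time + self.class_delays[delay_class]
        self.queues[delay_class].append((timestamp, packet))

    def dequeue(self):
        # Compare only the head-of-line entry of each nonempty class queue.
        best = None
        for c, q in enumerate(self.queues):
            if q and (best is None or q[0][0] < self.queues[best][0][0]):
                best = c
        if best is None:
            return None  # nothing to transmit
        return self.queues[best].popleft()
```

With C delay classes, each selection costs at most C comparisons regardless of how many packets are queued, which is the complexity reduction the discrete set of delay classes is meant to buy.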

In accordance with another aspect of the invention, the 
EDF-related packet server provides a different shaper from 
the embodiment according to the first aspect of the inven- 
tion. The scheduler is identical to the embodiment according 
to the first aspect of the invention. The shaper provides one 
shaping structure per delay class (referred to as delay class 
shaper), each of which shapes all connections corresponding 
to the delay class. Each delay class shaper supports a 
discrete number of shaping rates, each of which is associated 
with a FIFO queue. Each connection that has at least one 
packet in the shaper could be associated with a multiplicity 
of the FIFO queues. This is in contrast to the first aspect of 
the invention where each connection with at least one packet 
in the shaper is associated with exactly one FIFO queue. The 
delay class shaper has a two-level hierarchical structure. The 
lower level of the hierarchy uses FIFO queues, one per rate. 
At the higher level of the hierarchy, the timestamps associ- 
ated with each FIFO queue are used to select from among 
the different FIFO queues for that delay class. 

These and other aspects of the invention will become 
apparent in the ensuing detailed description taken in con- 
junction with the accompanying figures, which disclose a 
number of preferred embodiments of the invention. 

BRIEF DESCRIPTION OF THE FIGURES 

FIG. 1 illustrates a packet network in which a number of 
switches, data sources, and destinations are connected. 

FIG. 2 shows a schematic of the rate controlled service 
(RCS) discipline server illustrating the concept of traffic 
reshaping at each switch in the data network. 

FIG. 3 illustrates an example of the schedulability tests for 
a single node. 

FIG. 4 illustrates a communication switch in the packet 
network. 

FIG. 5 is a block diagram of the communication link
interface according to a first embodiment of the present
invention for scheduling the transmission of data packets in
a communication link.

FIG. 6 is a block diagram of the connection controller that 
is part of the communication link interface of FIG. 5. 
FIG. 7 is a block diagram of the server that is part of the
communication link interface of FIG. 5.

FIG. 8 is a block diagram of the delay class shaper that is
part of the server of FIG. 7.

FIGS. 9A-9G show in flowchart form a first method of
scheduling the transmission of data packets in a communication
link interface of FIG. 5 in accordance with the
principles of the present invention.

FIG. 10 shows in flowchart form a method of computing
the release time of an incoming packet by a leaky bucket in
accordance with the principles of the present invention.

DETAILED DESCRIPTION OF PREFERRED 
EMBODIMENTS 

FIG. 4 shows a communication switch 20 in a packet
network. The communication switch 20 includes a plurality
of input communication link interfaces 24, each of which
connects a plurality of input links 22 to an output link 27.
The output links 27 of these input communication link
interfaces 24 serve as inputs to a switch fabric 26. The
switch fabric 26 processes the link signals and outputs
another set of output links 28 to a plurality of output
communication link interfaces 30. The output communication
link interfaces 30 output ultimate output link signals 32
of the communication switch 20.

FIG. 5 shows a block diagram illustrating a first embodiment
of each input communication link interface 24 of FIG.
4 according to the present invention. The input communication
link interface 24 includes a data packet receiver 40,
which receives the data packets arriving from the input links
22. The receiver 40 uses the contents of the connection
identifier field contained in the header of each packet to
identify its respective connection i. All packets that the
receiver 40 receives have the same length. The receiver 40
thereafter sends each packet to a corresponding connection
controller 42. Each connection controller 42 stores a set of
information related to a particular connection i. A leaky-bucket
processor 44 determines a shaping rate $r_i$ that a
particular packet desires. Subsequently, a delay-class and
rate-group identifier 46 determines a delay class j to which
the connection belongs as well as an index i corresponding
to the shaping rate $r_i$. A server 48 groups connections into a
discrete set of a relatively small, predetermined number of
delay classes. A transmitter 50 transmits the output
of the server 48 to the corresponding output link 27.

FIG. 6 illustrates one preferred embodiment of each
connection controller 42 of FIG. 5 for a connection i
according to the current invention. In other words, for each
connection, the same structure is duplicated. Each connection
controller 42 includes the following:

(a) a connection queue 52, which is used to store the
received data packets of connection i, with registers
Qhead_Reg 62 and Qtail_Reg 60 containing pointers
to the head and tail, respectively, of the connection
queue 52,

(b) a register Backlog_Reg 54, which indicates whether
or not the connection is backlogged (i.e., has at least
one packet in its queue),

(c) a register Shaper_Index_Reg 56, which stores an
index of the rate group to which the connection
belongs, and

(d) a register Pivot_Reg 58, which indicates whether or
not the connection is a pivot (described below).

FIG. 7 illustrates one preferred embodiment of the server 
48 of FIG. 5 according to the current invention. In general, 
the server 48 groups connections into a discrete set of a 
relatively small, predetermined number C of delay classes. 
The server 48 includes at least two main components: a 
shaper 63 having C delay-class shapers 64 and a scheduler 
68. Each delay class is associated with a delay value, and 
this delay value corresponds to the delay guarantee provided 
by the scheduler 68 to the packets of each connection in the 
delay class. The delay-class shapers 64 provide one sched- 
uling structure per delay class. 

Similarly, for each delay class, the scheduler 68 includes 
a class queue 70 for queuing data packets, a register 
Timestamp_Reg 72 for storing timestamp information, and 
a register Class_Delay_Reg 74 for storing guaranteed
time-delay information. The server 48 also includes a sorter
76 for sorting the time-stamped information and a selector 
bypass unit 78 for inserting packet data from a bypass queue 
79 into a corresponding class queue 70. 

When a packet arrives for a connection at an input communication
link interface 24 of FIG. 4, the leaky-bucket
processor 44 of FIG. 5 determines the shaping rate $r_i$ that
is desired by the packet. Additionally, the delay-class and
rate-group identifier 46 determines a delay class j as well as
a rate class i. If the leaky-bucket computation reveals that
the packet is to be directly sent to a scheduler and the 
connection has no backlog, then the packet is inserted into 
the bypass queue 79 of FIG. 7. The selector bypass unit 78 
subsequently picks up the packet from the bypass queue 79 
and inserts it at the tail of the appropriate class queue 70 in 
the scheduler 68. If, on the other hand, the leaky-bucket 
computation reveals that the packet is not to be released 
immediately, or if the connection is backlogged (i.e., the 
corresponding connection queue 52 of FIG. 6 is not empty), 
then the packet at the tail of the corresponding connection 
queue 52 is marked with a tag equal to i, and the incoming 
packet is inserted at the tail of the corresponding connection 
queue 52. 
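
The arrival handling in the preceding paragraph can be illustrated with the following Python sketch. It is a simplified reading, not the patented logic: the names `Connection` and `on_arrival` are hypothetical, the leaky-bucket decision is passed in as the boolean `release_now`, and a newly queued packet is given a placeholder tag of -1 until a later arrival overwrites it.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Connection:
    delay_class: int
    queue: deque = field(default_factory=deque)  # stands in for connection queue 52

    @property
    def backlogged(self):
        # A connection is backlogged when its queue holds at least one packet.
        return len(self.queue) > 0


def on_arrival(conn, packet, rate_index, release_now, bypass_queue):
    """Route an arriving packet to the bypass queue or the connection queue."""
    if release_now and not conn.backlogged:
        # Immediate release with no backlog: the packet bypasses the shaper.
        bypass_queue.append((conn.delay_class, packet))
        return "bypass"
    if conn.queue:
        # The packet now at the tail records the rate index wanted by its successor.
        conn.queue[-1]["tag"] = rate_index
    conn.queue.append({"payload": packet, "tag": -1})
    return "queued"
```

Storing the successor's rate index on the preceding packet means the shaper can tell, at the moment it releases a packet, which rate group the connection needs next without recomputing the leaky-bucket state.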

FIG. 8 illustrates one preferred embodiment of the structure
of each delay-class shaper 64 of FIG. 7 according to the
current invention. Each delay-class shaper 64 shapes all the
connections in the corresponding delay class and supports a
discrete number G of shaping rates $r_1$ through $r_G$. The total
or overall rate at which each delay-class shaper 64 operates
for releasing packets is equal to the link rate r. This is
because the sum of the shaping rates of the connections in
a delay class cannot exceed the link rate r. In the j-th
delay-class shaper 64 shown in FIG. 8, the i-th shaping rate
$r_i$ is associated with a corresponding rate group
Class_j_Group_i 65. Connections belonging to delay class j that
need to be shaped at the rate $r_i$ are queued in the FIFO_j_i
80, which corresponds to the rate group Class_j_Group_i
65, and the selection in the delay-class shaper j is performed
only among connections at the head or top of FIFO_j_i 80,
where i=1 through G. In this preferred embodiment, each of
the FIFOs 80 is implemented as a linked list and has a
corresponding tail pointer stored in register Ftail_Reg 82
for indicating the end of the queue and a corresponding head
pointer stored in register Fhead_Reg 84 for indicating the
top of the queue.

A selector 88, located within delay-class shaper 64,
operates once per time slot of length L/r, which is equal to the
transmission time of a packet. In each slot, it selects, from
the FIFOs 80 corresponding to the rate groups within the
corresponding delay class j, the connection at the head of a
nonempty FIFO 80 that has the minimum eligible timestamp
stored in the corresponding register Timestamp_Group_Reg 86
among all rate groups belonging to that delay class j. An
eligible queue timestamp is defined to be one whose value is
not greater than the current time. Every time a FIFO is selected
and a packet is released from the queue of the connection at
the head of the selected FIFO into the scheduler 68 of FIG.
7, the corresponding timestamp stored in
Timestamp_Group_Reg 86 of FIG. 8 is incremented by
$L/(n_{j,i} r_i)$, where L is the packet length, $n_{j,i}$ is the
number of connections queued in FIFO_j_i, and $r_i$ is the rate
corresponding to the rate group. The number of connections
$n_{j,i}$ is stored in a corresponding register Nconn_Reg_j_i 90,
while the rate $r_i$ is stored in the register Rate_Reg_j_i 92.
If the connection that has been selected remains backlogged and
requires the same service rate $r_i$ as indicated by the tag
associated with the packet, then the connection is reinserted
at the tail of FIFO_j_i 80.
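
A minimal Python sketch of this selection rule follows. It is an illustration under simplifying assumptions, not the patented circuit: `RateGroup`, `select_group`, and `serve` are hypothetical names, and the connection count n is read directly from the FIFO length at service time rather than from a register such as Nconn_Reg that is refreshed separately.

```python
from collections import deque


class RateGroup:
    """One rate group of a delay-class shaper (illustrative)."""

    def __init__(self, rate):
        self.rate = rate        # shaping rate r_i of the group
        self.fifo = deque()     # FIFO of connection identifiers
        self.timestamp = 0.0    # group timestamp


def select_group(groups, now):
    # Eligible groups are nonempty and have a timestamp not greater than now.
    eligible = [g for g in groups if g.fifo and g.timestamp <= now]
    if not eligible:
        return None
    return min(eligible, key=lambda g: g.timestamp)


def serve(group, packet_len, stays_in_group):
    # Release one packet of the head connection and advance the timestamp
    # by L / (n * r), where n counts the connections queued in the group.
    n = len(group.fifo)
    conn = group.fifo.popleft()
    group.timestamp += packet_len / (n * group.rate)
    if stays_in_group:
        group.fifo.append(conn)  # still backlogged at the same rate
    return conn
```

For example, with two connections in a group of rate 2.0 and packets of length 1.0, each service advances the timestamp by 0.25, so a given connection returns to the head after the timestamp has advanced by 0.5, which equals L/r for that group.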

Each connection is shaped according to a piecewise-linear
shaping envelope, which is the combination of one or more
leaky-bucket envelopes as disclosed in R. Cruz, "A Calculus
for Network Delay, Part II: Network Analysis," IEEE
Transactions on Information Theory, pp. 121-141, January 1991,
and each of the leaky-bucket envelopes is associated with a
corresponding shaping rate. Because the shaping envelopes
for the connections are piecewise-linear functions, the
desired shaping rate may change while a connection is
backlogged. Consequently, a connection that is queued in a
given FIFO corresponding to a particular rate group in a
delay-class shaper may, once served, have to be queued in a
different FIFO corresponding to a different rate group within
the same delay-class shaper. Such an occurrence is referred
to as a rate jump. These jumps are computed and stored,
preferably within a tag portion of the packet, by the
leaky-bucket processor 44 of FIG. 5.
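
To see why the desired rate can change during a backlog, consider the following Python sketch. It assumes that the "combination" of leaky-bucket envelopes is their pointwise minimum, which the text does not state explicitly; each bucket is a hypothetical `(sigma, rho)` pair of burst size and rate, and the function names are illustrative only.

```python
def envelope(buckets, t):
    """Piecewise-linear envelope: pointwise minimum of sigma + rho * t."""
    return min(sigma + rho * t for sigma, rho in buckets)


def active_rate(buckets, t):
    """Slope of the envelope at time t, i.e. the rate of the binding bucket."""
    # On a tie between buckets, prefer the lower rate (the segment to the right).
    sigma, rho = min(buckets, key=lambda b: (b[0] + b[1] * t, b[1]))
    return rho
```

For the two buckets `(1.0, 10.0)` and `(5.0, 2.0)`, the envelope follows the steep bucket up to t = 0.5 and the shallow one afterwards, so the slope drops from 10.0 to 2.0 at the breakpoint. Under this assumption the slope can only decrease as t grows, which corresponds to a rate jump toward a lower rate group.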

To account for the variations in rate or rate jumps
experienced by backlogged connections, the concept of a pivot
connection is introduced to each of the rate groups having a
nonempty FIFO_j_i 80 of FIG. 8. Every nonempty FIFO_j_i
80 has one of its connections marked as a pivot, and a
corresponding flag is stored by each connection controller
42 of FIG. 5 in a corresponding register Pivot_Reg 58 of
FIG. 6. Consecutive appearances of the pivot at the head of
a FIFO_j_i 80 are separated by a difference of $L/r_i$ in the
values in register Timestamp_Group_Reg 86 of FIG. 8.
Connections that are to be inserted into a nonempty FIFO_j_i
80, either by virtue of a rate jump or by becoming newly
backlogged, are stored in an auxiliary queue Insertq_j_i 94
of FIG. 8. Registers Iqhead_Reg 98 and Iqtail_Reg 96 store
pointers to the head and tail, respectively, of the
corresponding auxiliary queue. The number of such connections
in the auxiliary queue Insertq_j_i 94 is stored in the register
Ninserts_Reg_j_i 100. Insertq_j_i 94 is appended to
FIFO_j_i 80 when the pivot makes an appearance at the
head of FIFO_j_i 80. Simultaneously, the register Nconn_Reg
90 is updated using the value stored in register Next_Reg
102, which accounts for deletions from FIFO_j_i 80,
and the register Ninserts_Reg 100.

To properly cope with a change in the shaping rate among
connections, a rate jump is performed only from a higher
rate to a lower rate within a busy period for the connection,
where a busy period is an interval during which the
connection is continuously backlogged. If a packet for a
connection arrives at the server while the connection is still
backlogged and requests a shaping rate that is higher than the
shaping rate requested by the previous packet for that
connection, the connection continues to be served at the lower
rate. However, a packet arriving to a non-backlogged
connection is inserted in the shaping FIFO queue corresponding
to its requested shaping rate, irrespective of whether or not
the previous packet of the connection was served at a lower
rate. This rule is justified by the observation that, in an
ideal shaper, a connection with a leaky-bucket traffic envelope
never increases its shaping rate while it is still backlogged.
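
The rate-jump rule can be condensed into a single decision, sketched below in Python. The function name `next_rate` and its arguments are hypothetical, and rates are compared by value rather than by rate-group index.

```python
def next_rate(current_rate, requested_rate, backlogged):
    """Rate at which the connection is shaped after a packet arrives."""
    if not backlogged:
        # A new busy period starts at whatever rate the packet requests.
        return requested_rate
    # Within a busy period the rate may only stay the same or decrease.
    return min(current_rate, requested_rate)
```

A backlogged connection shaped at 2.0 that receives a packet requesting 5.0 therefore stays at 2.0, while the same request arriving at an idle connection is honored.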

The following tables summarize the various blocks and 
structures as shown in FIGS. 5-10. 

Delay-Class Shaper j 64 of FIGS. 7 and 8:

  Timestamp_Group_Reg_j_g   Timestamp for Class j, Group g
  Rate_Reg_j_g              Rate at which Class j, Group g shapes
  Nconn_Reg_j_g             Number of connections in Class j, Group g
  FIFO_j_g                  FIFO of connections in Class j, Group g
                            (Fhead_Reg_j_g and Ftail_Reg_j_g contain
                            pointers to the head and tail, respectively)
  Insertq_j_g               Queue of connections waiting for insertion
                            in Class j, Group g FIFO (Iqhead_Reg_j_g and
                            Iqtail_Reg_j_g contain pointers to the head
                            and tail, respectively)
  Ninserts_Reg_j_g          Number of connections in insertion queue
  Next_Reg_j_g              Number of connections in FIFO for next round
  Pflag_Reg_j               Register for temporary storage of pivot flag
  Tflag_Reg_j               Register for temporary storage of packet tag
  Bypass Queue              Queue of packets which bypass the shaper

Connection Controller i 42 of FIGS. 5 and 6:

  Connection Queue i        Connection i packet queue (Qhead_Reg_i and
                            Qtail_Reg_i contain pointers to head and
                            tail, respectively)
  Backlog_Reg_i             Indicates if connection is backlogged
  Shaper_Index_Reg_i        Index of rate group to which connection
                            currently belongs
  Pivot_Reg_i               Indicates if connection is pivot

Scheduler 68 of FIG. 7:

  Class_Delay_Reg_i         Delay associated with Class i
  Timestamp_Reg_i           Timestamp for Class i

Packet:

  tag (in shaper)           Indicates shaping rate of next packet of
                            the connection
  tag (in scheduler)        Timestamp by which the packet has to be
                            served

FIGS. 9A-9G show steps involved in a preferred process
according to the current invention. Referring first to FIG.
9A, following the arrival of packets in step S510, one data
packet is selected in step S520. A connection i and a
corresponding delay class c are identified, respectively, in
steps S530 and S540. Leaky-bucket computation is performed
by leaky-bucket processor 44 of FIG. 5 in step S550
so as to determine a dominant leaky bucket in step S560 and
a shaping rate g desired by the packet in step S570. If, on the
other hand, no packet arrives at step S510, the process
proceeds to step S750 in FIG. 9C, which is described below.

Now referring to FIG. 9B, when the connection is selected
by the selector 88 of FIG. 8 in the corresponding delay-class
shaper 64 of FIG. 7 or by the selector bypass unit 78 of FIG.
7, the packet at the head of the connection's queue is to be
immediately released to the scheduler 68 of FIG. 7 in step
S580. If so, it is determined whether or not the connection
is backlogged based upon the value in register
Backlog_Reg_i 54 of FIG. 6 in step S590. Upon confirming a FALSE
value, the register Backlog_Reg_i 54 is now set to TRUE
in step S600, and the number of connections in FIFO_c_g
80 of FIG. 8 of the delay class c and the rate g is examined
using the stored value in the register Nconn_Reg_c_g 90
of FIG. 8 in step S610. If the number of connections in
FIFO_c_g as stored in the register Nconn_Reg_c_g 90 is
not zero, the register Pivot_Reg_i 58 of FIG. 6 is set to
FALSE in step S660 and the connection i is inserted into the
auxiliary queue Insertq_c_g 94 of FIG. 8 in step S670.
Consequently, the value of the register Ninserts_Reg_c_g
100 of FIG. 8 is incremented by one in step S680. After the
packet is marked with TAG=-1 in step S700, the packet is
inserted in connection queue i 52 of FIG. 6 in step S710.
Lastly, it is checked whether or not there remain more new
data packets in step S740. If data packets remain, the process
goes back to step S520 to process another data packet. The
process proceeds to step S750 if no new data packets are
available.

On the other hand, if the number of connections in the
FIFO is zero in step S610, the connection i is inserted in the
FIFO_c_g 80 in step S620. Consequently, the value of the
register Nconn_Reg_c_g 90 is incremented by one in step
S630, and the register Pivot_Reg_i 58 of FIG. 6 is set to
TRUE in step S640. The value stored in register 
Timestamp__Reg_c__g 72 of FIG. 7 is incremented in step 
S650 by a value L/Rate_Reg_c__g, where L is the packet 
length and the register Rate_Reg_c_g 92 of FIG. 8 con- 
tains the corresponding rate value, and processing proceeds 
to step S700. 

Still referring to FIG. 9B, if the packet is not to be 
immediately released in step S580 and the register 

Backlog_Reg i 54 of FIG. 6 indicates no backlog in step 

S720, the packet is inserted into the bypass queue 79 of FIG. 
7 in step S730, and the process proceeds to step S740. On the 
other hand, if the register Backlog_Reg__i 54 indicates 
some backlog in step S720, the packet at the tail of connec- 
tion queue i 52 of FIG. 6 is marked with TAG g in step S690. 
Step S690 is also performed after it has been determined that 




the packet is to be immediately released in step S580 if the register Backlog_Reg_i 54 in step S590 indicates some backlog. In either case, after step S690, the process proceeds to step S700.

Now referring to FIG. 9C, steps related to the bypass queue 79 in FIG. 7 are illustrated according to the current invention. In step S750, it is determined whether or not the selector bypass unit 78 is ready. If it is not ready, the process skips to step S830. On the other hand, if the selector bypass unit 78 is ready, a packet at the head of the bypass queue 79 is selected in step S760. The connection v is identified in step S770, while a corresponding delay class w is identified in step S780. After the packet tag is set to Current_Time+Class_Delay_Reg_w in step S790, it is determined whether or not the corresponding class queue w 70 of FIG. 7 is empty in step S800. If class queue w is empty, the register Timestamp_Reg_w 72 is set to the value contained in the TAG of the data packet in step S810, and the process proceeds to step S820 where the data packet is inserted into class queue w 70. On the other hand, if class queue w is not empty in step S800, the process proceeds directly to step S820. Following the insertion, if any selector p 80 of FIG. 8 is ready in step S830 and any register Nconn_Reg_p_q 90 contains a non-zero value in step S840, then the Class_p_Group_y 65 which has a value in register Nconn_Reg_p_y 90 greater than 0 and the smallest value in register Timestamp_Group_Reg_p_y 86 is selected in step S850, and the process proceeds to step S860. If either or both of the above two conditions as set forth in steps S830 and S840 are not met, the process skips to step S1300.

Referring to FIG. 9D, the connection as specified by Class_p_Group_y 64 of FIG. 8 and selected in the above step S850 is processed as follows. In step S860, it is determined whether or not the value stored in the register Timestamp_Group_Reg_p_y 86 is smaller than or equal to the current time Current_Time. If it is not, then the process proceeds to step S1300. On the other hand, if the value in the register Timestamp_Group_Reg_p_y 86 is smaller than or equal to the current time Current_Time in step S860, a connection j at the top of the queue FIFO_p_y 80 is selected in step S870, and the value in the register Pivot_Reg_j 58 of FIG. 6 is stored in the flag register Pflag_Reg_p 104 of FIG. 8 in step S880. The corresponding data packet is removed from the connection queue j 52 of FIG. 6 in step S890. After the dequeuing, a value in the corresponding TAG is also stored in the flag register Tflag_Reg_p 106 of FIG. 8 in step S900. Furthermore, a delay class u for the corresponding connection j is identified in step S910, and the TAG of the data packet is set to a value equal to Current_Time+Class_Delay_Reg_u 74 of FIG. 7 in step S920. When the class queue u 70 is empty as determined in step S930, the register Timestamp_Reg_u 72 is set to the TAG of the data packet in step S940. Otherwise, the process proceeds to step S950, where the data packet is inserted into class queue u 70. Lastly, the connection j is removed from the FIFO_p_y 80 of FIG. 8 in step S960.

Referring to FIG. 9E, additional steps are taken to handle a backlog situation. It is determined whether or not the connection j is backlogged in step S970. If connection j is not backlogged, then register Backlog_Reg_j 54 of FIG. 6 is set to a FALSE value in step S980, the value in register Next_Reg_p_y 102 of FIG. 8 is decremented by one in step S990, and processing proceeds to step S1120. Upon confirming the backlog in step S970, the delay class and rate group corresponding to the register Tflag_Reg_p 106 are used to identify Class_p_Group_z 65 in step S1000. If the rate z is equal to or greater than the group rate y in step S1010, the connection j is inserted in FIFO_p_y 80 in step S1020, and the process proceeds to step S1120.

On the other hand, if the rate z is less than the group rate y in step S1010, then the value in register Next_Reg_p_y 102 is decremented by one in step S1030. If it is determined in step S1040 that the register Nconn_Reg_p_z 90 stores a zero value, the connection j is inserted into FIFO_p_z 80 in step S1050 and the value in register Nconn_Reg_p_z 90 is incremented by one in step S1060. Subsequently, the register Pivot_Reg_j 58 of FIG. 6 is set to a TRUE value in step S1070, while the value in register Timestamp_Group_Reg_p_z 86 is incremented by (L/Rate_Reg_p_z), where L is the packet length, in step S1080, and processing proceeds to step S1120.

On the other hand, in step S1040, if the register Nconn_Reg_p_z 90 does not contain a zero value, the register Pivot_Reg_j 58 of FIG. 6 is set to a FALSE value in step S1090, and the connection j is inserted into Insertq_p_z 94 in step S1100. Lastly, the value in register Ninserts_Reg_p_z 100 is incremented by one in step S1110.

If the register Pflag_Reg_p 104 contains a TRUE value in step S1120, the process proceeds to step S1130. On the other hand, if the register Pflag_Reg_p 104 does not contain a TRUE value, then the register Timestamp_Group_Reg_p_y 86 is incremented by (L/Nconn_Reg_p_y*Rate_Reg_p_y) in step S1290, and the process proceeds to step S1300.

Now referring to FIG. 9F, steps related to the register Next_Reg_p_y 102 of FIG. 8 are handled. If the value in register Next_Reg_p_y 102 is not zero in step S1130, a connection m at the tail of FIFO_p_y 80 is identified in step S1220, and the value in the corresponding register Pivot_Reg_m 58 of FIG. 6 is set to TRUE in step S1230. Insertq_p_y 94 of FIG. 8 is appended to the tail of FIFO_p_y 80 in step S1240, while the register Nconn_Reg_p_y 90 is set to the value in register Next_Reg_p_y 102 in step S1250. The value in register Next_Reg_p_y 102 is incremented by the value in register Ninserts_Reg_p_y 100 in step S1260, and the register Ninserts_Reg_p_y 100 is reset to zero in step S1270. Lastly, the value in register Timestamp_Group_Reg_p_y 86 is incremented by (L/Nconn_Reg_p_y*Rate_Reg_p_y) in step S1280, and the process proceeds to step S1300.

On the other hand, if the value in register Next_Reg_p_y 102 is zero in step S1130, Insertq_p_y 94 is appended to the tail of FIFO_p_y 80 in step S1140, while the value in register Nconn_Reg_p_y 90 is set to the value in register Ninserts_Reg_p_y 100 in step S1150, the value in register Next_Reg_p_y 102 is set to the value in register Ninserts_Reg_p_y 100 in step S1160, and the register Ninserts_Reg_p_y 100 is reset to zero in step S1170. Subsequently, it is determined in step S1180 whether or not the value in register Nconn_Reg_p_y 90 is larger than zero. If not, the process proceeds to step S1300. On the other hand, if the value in register Nconn_Reg_p_y 90 is larger than zero, a connection k at the head of FIFO_p_y 90 is identified in step S1190, and the corresponding register Pivot_Reg_k 58 of FIG. 6 is set to TRUE in step S1200. Lastly, the value in register Timestamp_Group_Reg_p_y 86 of FIG. 8 is incremented by (L/Rate_Reg_p_y) in step S1210 before proceeding to step S1300.

Referring to FIG. 9G, a data packet is sent via transmitter 50 of FIG. 5 in the following steps. In step S1300, it is determined whether or not transmitter 50 is available. If it is not available, the process returns to step S510. On the other hand, if the transmitter is available in step S1300, the
nonempty class queue b 70 of FIG. 7 with the smallest value in the corresponding register Timestamp_Reg_b 72 is selected in step S1310. The data packet at the head of the selected class queue b 70 is sent to transmitter 50 in step S1320. After sending the data packet, it is determined whether or not class queue b 70 is empty in step S1330. If it is already empty, the process returns to step S510. In case the queue is not yet empty, the value in register Timestamp_Reg_b 72 is set to the TAG value of the data packet at the head of class queue b 70 in step S1340 before returning to step S510.

In summary, a tag associated with the packet is used to determine whether or not the connection is to be jumped to a FIFO corresponding to a lower rate group. If so, it is inserted into the insertion queue corresponding to the new rate group. The insertion queue is appended to the FIFO corresponding to the new rate group only when the pivot corresponding to the new rate group has been selected by selector 88 of FIG. 8. If, on the other hand, the tag indicates that a rate jump is not required, and the connection is still backlogged, it is reinserted at the tail of the same FIFO. The scheduler 68 of FIG. 7 maintains a class queue 70 of packets for every delay class. Scheduling is only performed among packets at the head of the class queues 70. If connection i belongs to a particular delay class j, then every time a packet belonging to connection i enters the scheduler 68 (from the shaper 63 or via the selector bypass unit 78), it is assigned a timestamp ahead of the current time by the delay value associated with that delay class, and linked at the tail of the corresponding class queue j 70. Each delay queue is served in a first-in-first-out manner since all packets in the queue have identical delay constraints. At each decision instant, the scheduler 68 selects the packet with the minimum timestamp among the packets at the head of each class queue 70. This timestamp value is stored in the corresponding register Timestamp_Reg 72, and the sorting of the C timestamps is accomplished by the sorter 76 within the scheduler 68.

FIG. 10 shows in flowchart form the computation of the delay incurred by an incoming packet in leaky bucket k. The leaky buckets together define the envelope on the connection's traffic. Registers Sigma_k and Rho_k hold the token bucket size and release rate, respectively, for the leaky bucket, while register Last_Release_k denotes the release time for the last packet and register Token_Counter_k denotes the current token count. In step S2010, it is determined whether or not the current time Cur_Time is larger than the value in register Last_Release_k. If it is not true, the process proceeds to step S2040. Otherwise, in step S2020, the value in register Token_Counter_k is determined, and, in step S2030, the register Last_Release_k is updated to the current time. Subsequently, in step S2040, the register Last_Release_k is incremented by MAX(0, (L-Token_Counter_k)/Rho_k). After the register Token_Counter_k is set to MAX(0, Token_Counter_k-L) in step S2050, the delay incurred by the packet is returned in the register Delay_k in step S2060. The overall delay incurred by the packet in the system of multiple leaky buckets (which together describe the envelope for the connection) is obtained as the maximum of the delays incurred in the individual leaky buckets, and this maximizing leaky bucket also determines the shaping rate and hence the rate group associated with the packet.

In the above, although the operation of the shaper has been described in the context of the implementation of an RC-EDF server, similar principles may be used in the implementation of shapers for other scheduling disciplines, wherein the shaping rates supported by the shaper are a discrete number, such as the disciplines described in F. M. Chiussi, A. Francini and J. G. Kneuer, "Implementing Fair Queueing in ATM Switches - Part 2: The Logarithmic Calendar Queue," Proc. GLOBECOM'97, pp. 519-525, November 1997; J. C. R. Bennett, D. C. Stephens and H. Zhang, "High Speed, Scalable, and Accurate Implementation of Fair Queueing Algorithms in ATM Networks," Proc. ICNP'97, pp. 7-14, October 1997.

In accordance with another aspect of the invention, the embodiments are identical to the first aspect described above except that rate jumps are handled differently. To be more specific, rate jumps are still performed only from a higher rate to a lower rate. However, if a packet for a connection arrives at the server while the connection is still backlogged and requests a shaping rate that is higher than the shaping rate requested by the previous packet for that connection, the connection is treated as newly backlogged and immediately queued into the FIFO corresponding to the new rate group. This implies that, at any given time, a connection may be waiting in more than one FIFO.

This aspect of the invention requires some modifications in the embodiments of the first aspect of the invention. Within each connection controller 42 of FIG. 5, the register Shaper_Index_Reg 56 of FIG. 6 is replaced by a link list of virtual connections. Each element of this link list corresponds to an entry in the FIFOs corresponding to the rate groups, and holds the index of the rate group with which it is associated. Further, FIFO_j_g 80 of FIG. 8 can contain virtual connection entries in it. This structure enables a connection to be in multiple FIFOs at the same time by replicating into various virtual connections. The arrangement, though not explicitly described here, can nevertheless easily be devised from the principles stated above.

In one embodiment, the present invention is a system which services a plurality of queues associated with respective data connections in a packet communication network such that the system guarantees data transfer delays between the data source and the destination of the data connections. This is achieved in two stages: the first stage shapes the traffic of each connection such that it conforms to a pre-specified envelope, while the second stage associates timestamps with the packets released by the first stage and chooses for transmission from among them the one with the smallest timestamp. Both stages are associated with a discrete set of delay classes. The first stage employs one shaping structure per delay class. Each shaping structure in turn supports a discrete set of rates and employs a FIFO of connections per supported rate. A connection may move between FIFOs corresponding to different rates as its rate requirement changes. The second stage associates with each packet exiting the first stage a timestamp given by the exit time from the first stage and the delay class to which the connection belongs. A queue of packets is maintained per delay class, and the scheduler selects for transmission from among the packets at the head of the queues the one with the smallest timestamp.

In at least one embodiment, the present invention is a method for shaping the data traffic of a plurality of connections, the connections traversing an associated communication switch, each of the connections being associated with a respective set of shaping rates and a respective data transfer delay time, each of the shaping rates being one of a first predetermined number of supported shaping rates, and each of the data transfer delay times being one of a second predetermined number of supported delay time classes, each of the connections being further associated with respective queues containing data packets. The method comprises the steps of:
(1) identifying for each data packet received via a plurality
of data links, the respective one of the connections and the
associated one of the queues;

(2) storing each of the received data packets in the 
identified queue; 

(3) computing for each of the received data packets the 
respective one of the shaping rates, and associating the 
computed shaping rate with the data packet; 

(4) identifying for each of the received data packets the 
respective one of the delay time classes; 

(5) associating a queue of connections with each connec- 
tion wherein the associated queue of data packets has at 
least one data packet waiting therein, the queue of 
connections being associated with one of the supported 
delay time classes and one of the supported shaping 
rates, the queue of connections being associated with 
the delay time class and shaping rate associated with 
the packet at the head of the queue of data packets 
associated with the connection; 

(6) associating a first timestamp with each queue of 
connections, including generating a new first times- 
tamp each time a connection enters an empty queue of 
connections, wherein a system time is used in the 
generation of the new first timestamp; 

(7) associating a cumulative service rate with each of the 
queues of connections, the cumulative service rate 
being used to generate the respective first timestamp 
associated with each of the queues of connections; 

(8) selecting one of the first timestamps associated with 
the queues of connections associated with each delay 
time class which have at least one connection waiting 
for service therein, and identifying the connections at 
the head of the queue of connections associated with 
the selected first timestamps as the recipient of the next 
service, the service including the steps of removing a 
data packet from the head of the queue associated with 
the identified connections, and transmitting the data 
packets; 

(9) determining if the queue of data packets associated 
with the serviced connection has data packets waiting 
therein, and storing the connection in the queue of 
connections associated with the delay time class and 
shaping rate associated with the packet at the head of 
the queue of data packets; 

(10) determining if the queue of connections associated 
with the serviced connection has connections waiting 
therein, and computing a new first timestamp associ- 
ated to the queue of connections using the associated 
cumulative service rate. 

In at least one embodiment, the set of shaping rates 
associated with each of the connections identifies an asso- 
ciated traffic envelope. 

In at least one embodiment, the shaping rates respectively 
correspond to a slope value corresponding to a piece of a 
piecewise linear envelope. 
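The link between leaky buckets, the piecewise linear envelope, and the per-packet shaping rate can be illustrated with a minimal Python sketch of the FIG. 10 computation. It rests on two assumptions that the description leaves open: that step S2020 replenishes the token counter at rate Rho_k for the time elapsed since Last_Release_k, capped at Sigma_k, and that the delay returned in step S2060 is Last_Release_k minus the current time. The names LeakyBucket, packet_delay and dominant_bucket are illustrative only and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass
class LeakyBucket:
    sigma: float                # token bucket size (Sigma_k)
    rho: float                  # token release rate (Rho_k), must be > 0
    last_release: float = 0.0   # Last_Release_k
    tokens: float = 0.0         # Token_Counter_k

    def packet_delay(self, length: float, now: float) -> float:
        """Delay a packet of the given length incurs in this bucket."""
        # S2010: has the clock moved past the last release time?
        if now > self.last_release:
            # S2020 (assumed refill rule): accrue tokens for the elapsed
            # time, never exceeding the bucket size.
            elapsed = now - self.last_release
            self.tokens = min(self.sigma, self.tokens + self.rho * elapsed)
            # S2030: the last release time catches up with the clock.
            self.last_release = now
        # S2040: push the release time out by the time needed to earn
        # the tokens the packet is short of.
        self.last_release += max(0.0, (length - self.tokens) / self.rho)
        # S2050: consume tokens for this packet.
        self.tokens = max(0.0, self.tokens - length)
        # S2060 (assumed): delay is release time relative to now.
        return self.last_release - now


def dominant_bucket(buckets, length, now):
    """Return (index, delay, shaping rate) of the bucket with the largest delay."""
    delays = [b.packet_delay(length, now) for b in buckets]
    k = max(range(len(buckets)), key=lambda i: delays[i])
    return k, delays[k], buckets[k].rho
```

In this sketch the shaping rate reported for a packet is the release rate of the bucket that delays it the most, which corresponds to the slope of the envelope segment that is currently binding.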

In at least one embodiment, the selected data packets are 
transmitted to an output. 

In at least one embodiment, the queues of connections are 
first-in-first-out link lists. 

In at least one embodiment, the cumulative service rate 
associated with each of the queues of connections is equal to 
the number of connections in that queue times the shaping 
rate corresponding to that queue. 

In at least one embodiment, the cumulative service rate 
associated with each of the queues of connections is updated 
when a connection identified as pivot connection is served, 
the pivot connection being one of the connections in the
queue of connections. 

In at least one embodiment, the new first timestamp 
computed when a queue of connections is served and having 
data packets waiting therein is equal to the previous times- 
tamp incremented by the length of the data packet at the 
head of the queue of data packets associated with the 
connection at the head of the queue of connections, divided 
by the cumulative service rate. 
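The cumulative service rate and the timestamp rule stated above combine into a single update. The following Python sketch is a minimal illustration under the stated definitions; the function names and the example figures are hypothetical.

```python
def cumulative_service_rate(num_connections: int, shaping_rate: float) -> float:
    """Number of connections in the queue times the queue's shaping rate."""
    return num_connections * shaping_rate


def next_queue_timestamp(prev_timestamp: float,
                         head_packet_length: float,
                         num_connections: int,
                         shaping_rate: float) -> float:
    """Previous timestamp plus packet length divided by the cumulative rate."""
    rate = cumulative_service_rate(num_connections, shaping_rate)
    if rate <= 0:
        raise ValueError("queue of connections must be non-empty with a positive rate")
    return prev_timestamp + head_packet_length / rate


# Hypothetical figures: 4 connections shaped at 1e6 bit/s each, 8000-bit packet.
# The queue becomes eligible again 8000 / 4e6 = 0.002 s after its last timestamp.
example = next_queue_timestamp(1.0, 8000.0, 4, 1e6)
```

Because the increment shrinks as the number of connections grows, a queue holding n connections is visited roughly n times as often, so each connection is still served at about its own shaping rate.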

In at least one embodiment, a connection stored in a 
previously empty queue of connections is identified as the 
pivot connection. 

In at least one embodiment, the connection stored after the 
pivot connection in the associated queue of connections at 
the time when the pivot connection is served and is not 
stored back in the queue of connection is declared as the new 
pivot connection. 
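The two pivot rules above can be sketched in Python as follows. This is a simplified illustration of those two statements only: it omits the insertion queue and rate jumps of the detailed embodiment, and the class and method names are hypothetical.

```python
from collections import deque


class ConnectionFifo:
    """Queue of connections for one (delay class, shaping rate) pair."""

    def __init__(self):
        self.fifo = deque()
        self.pivot = None  # connection currently marked as pivot

    def insert(self, conn):
        # A connection stored in a previously empty queue becomes the pivot.
        if not self.fifo:
            self.pivot = conn
        self.fifo.append(conn)

    def serve_head(self, still_backlogged: bool):
        """Serve the head connection; return (connection, was_pivot)."""
        conn = self.fifo.popleft()
        was_pivot = conn == self.pivot
        if still_backlogged:
            # The connection has more packets: store it back at the tail.
            self.fifo.append(conn)
        elif was_pivot:
            # The pivot leaves the queue: the connection stored after it,
            # now at the head, becomes the new pivot (None if queue is empty).
            self.pivot = self.fifo[0] if self.fifo else None
        return conn, was_pivot
```

Each time `serve_head` reports `was_pivot`, one full round over the queue has completed, which is the point at which the cumulative service rate would be updated.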

In at least one embodiment, the invention further com- 
prises a method of guaranteeing a delay time to each 
connection, said method comprising the steps of: 

(a) associating a queue of data packets with each of the 
supported delay time classes; 

(b) associating each served packet with a second times- 
tamp equal to the system time plus the delay time 
associated with the corresponding connection, remov- 
ing the packet from the queue of packets associated 
with the connection and storing the removed data 
packet to the tail of the queue of data packets associated 
with the delay time class; and 

(c) selecting one of the second timestamps associated with 
the data packets at the head of the queues associated 
with the delay time classes which have at least one 
packet waiting therein as the recipient of the next 
service, the service including the steps of removing the 
data packet from the head of the queue associated with 
the delay bound class of the selected timestamp, and 
transmitting the removed data packet to an output. 

In at least one embodiment, the selected second times- 
tamp is the minimum of the timestamp associated with the 
data packets at the head of the queues associated with the 
delay time classes which have at least one packet waiting 
therein. 
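The second stage described in steps (a) through (c) can be sketched as one FIFO per delay class plus a minimum search over the head timestamps. The Python sketch below is a minimal illustration; the class name, method names and the use of a dictionary keyed by delay class are assumptions made for the example.

```python
from collections import deque


class ClassQueueScheduler:
    """Earliest-deadline-first selection over per-delay-class FIFO queues."""

    def __init__(self, class_delays):
        # class_delays maps a delay class identifier to its delay time.
        self.class_delays = dict(class_delays)
        self.queues = {c: deque() for c in self.class_delays}

    def enqueue(self, delay_class, packet, now):
        # Second timestamp = system time + delay time of the class.
        deadline = now + self.class_delays[delay_class]
        self.queues[delay_class].append((deadline, packet))
        return deadline

    def dequeue(self):
        """Remove and return (deadline, packet) with the smallest head timestamp."""
        nonempty = [c for c, q in self.queues.items() if q]
        if not nonempty:
            return None
        best = min(nonempty, key=lambda c: self.queues[c][0][0])
        return self.queues[best].popleft()
```

Only the head of each class queue needs to be compared: all packets in a class share the same delay time, so as long as the system time does not run backwards the timestamps within a class queue are already in non-decreasing order.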

The illustrative embodiments described above are but two 
examples of the principles that may be used to schedule the 
transmission of data packets according to the present inven- 
tion. Those skilled in the art will be able to devise numerous 
arrangements, which, although not explicitly shown or 
described herein, nevertheless embody those principles that 
are within the spirit and scope of the present invention. 

What is claimed is: 

1. A method of shaping data traffic of one or more 
connections traversing a switch, the method comprising the
steps of: 

a) determining a shaping rate and a transfer delay class for 
one or more data packets of a connection received at the 
switch; 

b) enqueueing (i) each data packet of the connection in a 
data queue associated with one or more connections 
having the same shaping rate and transfer delay class as 
the data packet and (ii) the connection in a corresponding
connection queue associated with one or more
connections having the same shaping rate and transfer 
delay class as the data packet; 

c) selecting for service, for a transfer delay class, a 
connection in a non-empty connection queue having 
the corresponding transfer delay class, based on a 
timestamp of the connection queue; 
d) transmitting one or more data packets of the selected
connection from the corresponding data queue; 

e) dequeueing the selected connection from the connec- 
tion queue; and 

f) enqueuing the dequeued connection back into the 
connection queue, if the dequeued connection has one 
or more remaining data packets in the corresponding 
data queue. 

2. The method as recited in claim 1, further comprising 
the step of g) updating a cumulative service rate and a 
timestamp of the connection queue of the enqueued selected 
connection, wherein 

the timestamp for the connection queue is determined 
from the cumulative service rate of the queue and a 
system potential of the switch when a connection is 
enqueued in a corresponding connection queue that is 
empty. 

3. The method as recited in claim 2, wherein, for step g), 
the cumulative service rate is updated for each service of a 
pivot connection enqueued in a corresponding connection 
queue. 

4. The method as recited in claim 2, wherein, for step g), 
the cumulative service rate of the queue is related to a 
combination of i) a number of connections in the connection 
queue and ii) the corresponding shaping rate of the connec- 
tion queue. 

5. The method as recited in claim 2, wherein, for step g), 
the timestamp is updated when the connection queue is 
served and remains un-empty, and the updated timestamp is 
computed as i) the timestamp that is incremented by a length 
of the data packet at the head of the data queue associated 
with the connection at the head of the corresponding con- 
nection queue that is served, which incremented timestamp 
is then divided by the cumulative service rate. 

6. The method as recited in claim 5, wherein a connection 
enqueued in a empty connection queue is a pivot connection, 
and a connection stored after the pivot connection is set as 
a new pivot connection if the pivot connection is dequeued 
and not enqueued after the pivot connection is served in step 

7. The method as recited in claim 1, wherein, for step a), 
each shaping rate corresponds to a traffic envelope associ- 
ated with data packets of a connection. 

8. The method as recited in claim 1, wherein, for step a), 
each shaping rate corresponds to a value of slope for a 
segment of a piece-wise linear traffic envelope. 

9. The method as recited in claim 1, wherein, for steps b) 
through f), each connection queue is a first-in, first-out 
linked list. 

10. The method as recited in claim 1, further comprising 
the step of g) guaranteeing a delay time to each connection, 
step g) comprising the steps of: 

g1) associating a data queue with each delay class, each
data queue having data packets of a corresponding
connection with the determined transfer delay class of
step a);

g2) associating a second timestamp for each data queue
for data packets of the connection serviced with steps
c) and d) wherein the second timestamp is set as a
combination of the system potential and the delay time
of the delay class associated with the serviced data
packet;

g3) dequeuing the serviced data packet from the corresponding
connection queue;

g4) enqueuing the service data packet at the tail of the data
queue with the delay class equivalent to the delay class
of the corresponding connection queue;

g5) selecting a data packet from one of the data queues
based on the second timestamp; and

g6) transmitting the selected data packet.



11. The method as recited in claim 10, wherein the 
selected second timestamp is a relative minimum timestamp 
of the non-empty connection queues, wherein each non- 
empty connection queue is a connection queue associated 
with the data packets at the head of the data queues. 

12. Apparatus for shaping data traffic of one or more 
connections traversing a switch, the apparatus comprising:

a shaper determining a shaping rate and a transfer delay 
class for one or more data packets of a connection 
received at the switch; and 
a server enqueueing (i) each data packet of the connection 
in a data queue associated with one or more connec- 
tions having the same shaping rate and transfer delay 
class as the data packet and (ii) the connection in a 
corresponding connection queue associated with one or 
more connections having the same shaping rate and 
transfer delay class as the data packet, 
wherein the server i) selects for service, for a transfer 
delay class, a connection in a non-empty connection 
queue having the corresponding transfer delay class, 
based on a timestamp of the connection queue, and ii) 
causes one or more data packets of the selected con- 
nection to be transmitted from the corresponding data 
queue; and 

wherein the server 1) dequeues the selected connection 
from the connection queue; and 2) enqueues the 
dequeued connection back into the connection queue, if 
the dequeued connection has one or more remaining 
data packets in the corresponding data queue. 

13. The apparatus as recited in claim 12, wherein the
shaper updates a cumulative service rate and a timestamp of
the connection queue of the enqueued selected connection,
the timestamp for the connection queue determined from the
cumulative service rate of the queue and a system potential
of the switch when a connection is enqueued in a corresponding
connection queue that is empty.

14. The apparatus as recited in claim 12, wherein the 
apparatus is embodied in either a switch or a router of a 
packet network. 

15. A computer-readable medium having stored thereon a 
plurality of instructions, the plurality of instructions includ- 
ing instructions which, when executed by a processor, cause 
the processor to implement a method for shaping data traffic 
of one or more connections traversing a switch, the method 
comprising the steps of: 

a) determining a shaping rate and a transfer delay class for 
one or more data packets of a connection received at the 
switch; 

b) enqueueing (i) each data packet of the connection in a 
data queue associated with one or more connections 
having the same shaping rate and transfer delay class as 
the data packet and (ii) the connection in a correspond- 
ing connection queue associated with one or more 
connections having the same shaping rate and transfer 
delay class as the data packet; 

c) selecting for service, for a transfer delay class, a 
connection in a non-empty connection queue having 
the corresponding transfer delay class, based on a 
timestamp of the connection queue; 

d) transmitting one or more data packets of the selected 
connection from the corresponding data queue; 

e) dequeueing the selected connection from the connec- 
tion queue; and 

f) enqueuing the dequeued connection back into the 
connection queue, if the dequeued connection has one 
or more remaining data packets in the corresponding 
data queue. 